Bounds for the Uniform Deviation of Empirical Measures
Author
Abstract
If $X_1, \ldots, X_n$ are independent identically distributed $\mathbb{R}^d$-valued random vectors with probability measure $\mu$ and empirical probability measure $\mu_n$, and if $\mathcal{A}$ is a subset of the Borel sets on $\mathbb{R}^d$, then we show that $P\{\sup_{A \in \mathcal{A}} |\mu_n(A) - \mu(A)| \geq \varepsilon\} \leq c\, s(\mathcal{A}, n^2)\, e^{-2n\varepsilon^2}$, where $c$ is an explicitly given constant, and $s(\mathcal{A}, n)$ is the maximum over all $(x_1, \ldots, x_n) \in \mathbb{R}^{dn}$ of the number of different sets in $\{\{x_1, \ldots, x_n\} \cap A \mid A \in \mathcal{A}\}$. The bound strengthens a result due to Vapnik and Chervonenkis.
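To make the counting quantity in the abstract concrete, the following minimal sketch counts the distinct sets that a class picks out of a finite sample on the real line, for two simple classes chosen here purely for illustration: half-lines $(-\infty, t]$ and closed intervals $[a, b]$. The function names, the choice of classes, and the placeholder constant `c` are assumptions; the abstract only says that the constant is explicitly given and does not state its value.

```python
import math


def trace_count_half_lines(points):
    """Count distinct sets {x_1, ..., x_n} ∩ (-inf, t] as t ranges over the reals."""
    pts = sorted(set(points))
    traces = {frozenset()}  # t below every point picks out the empty set
    for i in range(len(pts)):
        traces.add(frozenset(pts[: i + 1]))  # t between pts[i] and pts[i + 1]
    return len(traces)


def trace_count_intervals(points):
    """Count distinct sets {x_1, ..., x_n} ∩ [a, b] over all a <= b."""
    pts = sorted(set(points))
    traces = {frozenset()}  # an interval that misses every point
    for i in range(len(pts)):
        for j in range(i, len(pts)):
            traces.add(frozenset(pts[i : j + 1]))  # a contiguous run of points
    return len(traces)


def bound_shape(c, shatter_at_n_squared, n, eps):
    """Evaluate c * s(A, n^2) * exp(-2 * n * eps^2).

    c is a placeholder supplied by the caller; its actual value is not
    given in the abstract.
    """
    return c * shatter_at_n_squared * math.exp(-2.0 * n * eps ** 2)


sample = [0.3, 1.2, 2.5, 4.0]
print(trace_count_half_lines(sample))  # 5, i.e. n + 1 for n = 4 distinct points
print(trace_count_intervals(sample))   # 11, i.e. n(n + 1)/2 + 1 for n = 4
```

For half-lines the count on $m$ distinct points is $m + 1$, so in this illustrative case $s(\mathcal{A}, n^2) = n^2 + 1$ grows only polynomially while the factor $e^{-2n\varepsilon^2}$ decays exponentially in $n$.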
Similar resources
Uniform Deviation Bounds for Unbounded Loss Functions like k-Means
Uniform deviation bounds limit the difference between a model’s expected loss and its loss on an empirical sample uniformly for all models in a learning problem. As such, they are a critical component to empirical risk minimization. In this paper, we provide a novel framework to obtain uniform deviation bounds for loss functions which are unbounded. In our main application, this allows us to ob...
Uniform Deviation Bounds for k-Means Clustering
Uniform deviation bounds limit the difference between a model’s expected loss and its loss on a random sample uniformly for all models in a learning problem. In this paper, we provide a novel framework to obtain uniform deviation bounds for unbounded loss functions. As a result, we obtain competitive uniform deviation bounds for k-Means clustering under weak assumptions on the underlying distri...
The false discovery rate for statistical pattern recognition
Abstract: The false discovery rate (FDR) and false nondiscovery rate (FNDR) have received considerable attention in the literature on multiple testing. These performance measures are also appropriate for classification, and in this work we develop generalization error analyses for FDR and FNDR when learning a classifier from labeled training data. Unlike more conventional classification perform...
Generalization Bounds for the Area Under the ROC Curve
We study generalization properties of the area under the ROC curve (AUC), a quantity that has been advocated as an evaluation criterion for the bipartite ranking problem. The AUC is a different term than the error rate used for evaluation in classification problems; consequently, existing generalization bounds for the classification error rate cannot be used to draw conclusions about the AUC. I...
Generalization Bounds for the Area Under an ROC Curve
We study generalization properties of the area under an ROC curve (AUC), a quantity that has been advocated as an evaluation criterion for bipartite ranking problems. The AUC is a different and more complex term than the error rate used for evaluation in classification problems; consequently, existing generalization bounds for the classification error rate cannot be used to draw conclusions abo...